Tagged mathematics in PDFs for accessibility and other purposes

Author

  • Ross Moore
Abstract

PDF has been the preferred format for publishing mathematics for many years. As delivery shifts from predominantly paper to electronic, the document format needs corresponding enhancements. Not least among the drivers are implicit legal obligations to satisfy Accessibility criteria. The answer developed for PDF is tagging of document structure and content types, as described in the PDF/UA Implementation Guide [4]. Wikipedia describes this as “not a separate file-format but simply a way to use PDF” [12]; when it is supported, “reader software will be able to reliably reflow text onto small screens, provide powerful navigation options, transform text appearance, improve search engine functionality, aid in the selection and copying of text, and more” [12]. Academic publishers are starting to see these benefits and will doubtless soon require at least minimal tagging of online PDF documents for Accessibility purposes, much as Accessibility tags have been incorporated into HTML. Here is a brief overview of work done by the author to incorporate full MathML tagging of mathematical content in documents produced primarily using the LaTeX typesetting system. Since the publicly available TeX software was not written to support such tagging of document content, further software tools are also required: a modified version of pdfTeX, a self-developed Perl program, TeX to MathML conversion software, some standard Unix command-line utilities, and extensive self-written TeX and LaTeX macros. As this is a continuation of work presented at the CICM meetings in 2009 [5], we concentrate here mostly on the advancements made since then.
These include the ability to capture complete math-environments from a running LaTeX job and automatically invoke a conversion of the LaTeX source of that piece of mathematics into Presentation MathML, using whatever appropriate conversion software is available. Previously the MathML version needed to be available independently of the LaTeX source. Now this conversion can be done ‘on-the-fly’, using TeX4ht for example, before the MathML and LaTeX descriptions of the same piece of mathematics are merged into a new extended LaTeX description incorporating macros that generate the appropriate tagging and enrichment to satisfy Accessibility requirements. Such automatic conversion and merging can add significantly to the total running time for the whole job, so an indexing system has been developed which




Publication date: 2013